The main purpose of this work is to estimate the regression function of
a real random variable with a functional explanatory variable by using a
recursive nonparametric kernel approach. The mean square error and
the almost sure convergence of a family of recursive kernel estimates of
the regression function are derived. These results are established with
rates and precise evaluation of the constant terms. Also, a central
limit theorem for this class of estimators is established. The method
is evaluated on simulations and on a real dataset.
1 Introduction

Functional data analysis is a branch of statistics that has been the object of
many studies and developments in recent years. This kind of data appears
in many practical situations, as soon as one is interested in a continuous
phenomenon, for instance. For this reason, the possible application fields
for functional data are very wide: climatology, economics, linguistics,
medicine, and so on. Since the pioneering works of Ramsay and Dalzell (1991)
and Frank and Friedman (1993), many developments have been investigated
in order to build theory and methods around functional data: for instance,
how to define the mean or the variance of functional data, or
what kind of model can be considered with functional data.
These papers also highlight the drawbacks of merely applying multivariate
methods to this kind of data, and suggest instead treating
these data as objects belonging to some functional space. The monographs
of Ramsay and Silverman (2002, 2005) present an
overview of both theoretical and practical aspects of functional data
analysis.

One of the most studied models in this functional setting is the regression
model when the variable of interest $Y$ is real and the covariate $X$ belongs to
some functional space $\mathcal{E}$, endowed with a semi-norm $\|\cdot\|$. Then, the
regression model is written

$$Y = r(X) + \varepsilon, \tag{1}$$

where $r : \mathcal{E} \to \mathbb{R}$ is an operator and $\varepsilon$ is an error random variable. Many
works have addressed this model when the operator $r$ is assumed
to be linear, contributing to the popularity of the so-called functional
linear model. In this linear context, the operator $r$ is written $\langle \alpha, \cdot \rangle$, where
$\langle \cdot, \cdot \rangle$ stands for an inner product of the space $\mathcal{E}$ and $\alpha$ belongs to $\mathcal{E}$. The goal is
then to estimate the unknown function $\alpha$. We refer the reader, for instance,
to the works of Cardot et al. (2003) or Crambes et al. (2009) for different
methods to estimate $\alpha$. Another way to approach the model (1) is to think
in a nonparametric way. This direction has also been investigated by many
authors. Recent advances on the topic have been the object of monographs
by Ferraty and Vieu (2006) and Ferraty and Romain (2010), giving theoretical
and practical properties of a kernel estimator of the operator $r$. More
precisely, if $(X_i, Y_i)_{i=1,\dots,n}$ is a sample of independent and identically distributed
couples with the same law as $(X, Y)$, this kernel estimator is defined, for all
$\chi \in \mathcal{E}$, by
$$r_n(\chi) = \frac{\sum_{i=1}^n Y_i K\left(\|\chi - X_i\| / h\right)}{\sum_{i=1}^n K\left(\|\chi - X_i\| / h\right)}, \tag{2}$$

where $K$ is a kernel and $h > 0$ is a bandwidth. In the dependent case,
Masry (2005) considered the asymptotic normality of this estimator, while the
almost sure convergence was obtained by Ling and Wu (2012). This
nonparametric regression estimator raises several problems, such as the choice of the
semi-norm $\|\cdot\|$ of the space $\mathcal{E}$ and the choice of the bandwidth. Concerning
the bandwidth, when the covariate is real, many solutions have been
considered, for instance cross validation. Recently, in the multivariate setting,
Amiri (2012) studied an estimator using a sequence of bandwidths that
allows this estimator to be computed recursively, generalizing previous
works of Devroye and Wagner (1980) and Ahmad and Lin (1976). This
estimator shows good theoretical properties, from the point of view of mean
square error and almost sure convergence. It also has practical advantages:
for instance, it saves computational time when one wants to
predict new values of the variable of interest as new observations appear.
This is not the case for the basic kernel estimator, which has to be computed
again on the whole sample. The purpose of this work is to adapt the
recursive estimator studied in Amiri (2012) to the case where the covariate is of
functional nature.

The remainder of the paper is organized as follows. In Section 2, we define
the recursive estimator of the operator $r$ when the covariate $X$ is functional
and we present the asymptotic properties of this estimator. In Section 3, we
evaluate the performance of our estimator with a simulation study and the
treatment of a real dataset. Finally, the proofs of the theoretical results are
postponed to Section 4.



2 Functional regression estimation 
2.1 Notations and assumptions 

Let $(X, Y)$ be a pair of random variables defined on $(\Omega, \mathcal{A}, P)$, with values in
$\mathcal{E} \times \mathbb{R}$, where $\mathcal{E}$ is a Banach space endowed with a semi-norm $\|\cdot\|$. Assume that
$(X_i, Y_i)_{i=1,\dots,n}$ is a sample of $n$ independent and identically
distributed random variables, having the same distribution as $(X, Y)$. The model (1) is then
rewritten as

$$Y_i = r(X_i) + \varepsilon_i, \quad i = 1, \dots, n,$$

where for any $i = 1, \dots, n$, $\varepsilon_i$ is a random variable such that $E(\varepsilon_i \mid X_i) = 0$
and $E(\varepsilon_i^2 \mid X_i) = \sigma_\varepsilon^2(X_i) < \infty$.

Nonparametric regression aims to estimate the functional $r(\chi) := E(Y \mid X = \chi)$,
for $\chi \in \mathcal{E}$. To this end, let us consider the family of recursive estimators
indexed by a parameter $\ell \in [0, 1]$, and defined by
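$$r_n^{[\ell]}(\chi) = \frac{\sum_{i=1}^n \frac{Y_i}{F(h_i)^\ell} K\left(\frac{\|\chi - X_i\|}{h_i}\right)}{\sum_{i=1}^n \frac{1}{F(h_i)^\ell} K\left(\frac{\|\chi - X_i\|}{h_i}\right)},$$

where $K$ is a kernel, $(h_n)$ a sequence of bandwidths and $F$ the cumulative
distribution function of the random variable $\|\chi - X\|$. Our family of
estimators is a recursive modification of the estimate defined in (2) and can be
computed recursively by

$$r_{n+1}^{[\ell]}(\chi) = \frac{\left[\sum_{i=1}^n F(h_i)^{1-\ell}\right] \varphi_n^{[\ell]}(\chi) + \left[\sum_{i=1}^{n+1} F(h_i)^{1-\ell}\right] Y_{n+1} K_{n+1}^{[\ell]}\left(\|\chi - X_{n+1}\|\right)}{\left[\sum_{i=1}^n F(h_i)^{1-\ell}\right] f_n^{[\ell]}(\chi) + \left[\sum_{i=1}^{n+1} F(h_i)^{1-\ell}\right] K_{n+1}^{[\ell]}\left(\|\chi - X_{n+1}\|\right)},$$

with

$$\varphi_n^{[\ell]}(\chi) = \frac{\sum_{i=1}^n \frac{Y_i}{F(h_i)^\ell} K\left(\frac{\|\chi - X_i\|}{h_i}\right)}{\sum_{i=1}^n F(h_i)^{1-\ell}}, \qquad f_n^{[\ell]}(\chi) = \frac{\sum_{i=1}^n \frac{1}{F(h_i)^\ell} K\left(\frac{\|\chi - X_i\|}{h_i}\right)}{\sum_{i=1}^n F(h_i)^{1-\ell}}, \tag{3}$$

and $K_i^{[\ell]}(\cdot) := \frac{1}{F(h_i)^\ell \sum_{j=1}^i F(h_j)^{1-\ell}} K\left(\frac{\cdot}{h_i}\right)$.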

More precisely, $r_n^{[\ell]}(\chi)$ is the adaptation to the functional model of the
finite-dimensional recursive family of estimators introduced by Amiri (2012), which
includes the well-known recursive ($\ell = 0$) and semi-recursive ($\ell = 1$)
estimators. The recursive property of this class of estimators is clearly useful
in sequential investigations and also for large sample sizes: when a
new observation is added, non-recursive estimators must be entirely
recomputed, and extensive data must be stored in order to calculate
them.
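The sequential update described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the class name `RecursiveEstimator` and its arguments are hypothetical, the small ball distribution function `small_ball_cdf` is assumed to be supplied by the user, and the estimator is tracked at a single fixed curve. Because the normalising sum of $F(h_i)^{1-\ell}$ cancels in the ratio of the numerator and denominator in (3), only two running sums need to be stored.

```python
import numpy as np


def quadratic_kernel(u):
    """K(u) = 1 - u^2 on [0, 1] and 0 elsewhere (hypothetical default kernel)."""
    return (1.0 - u * u) if 0.0 <= u <= 1.0 else 0.0


class RecursiveEstimator:
    """Tracks the ratio of weighted sums at one fixed curve chi.

    Stores only two running sums, so adding an observation costs O(1)
    and past observations do not need to be kept.
    """

    def __init__(self, ell, small_ball_cdf, kernel=quadratic_kernel):
        self.ell = ell              # parameter ell in [0, 1]
        self.F = small_ball_cdf     # assumed known: F(h) = P(||chi - X|| <= h)
        self.kernel = kernel
        self.num = 0.0              # sum of Y_i K(d_i / h_i) / F(h_i)^ell
        self.den = 0.0              # sum of     K(d_i / h_i) / F(h_i)^ell

    def update(self, dist, y, h):
        """Add one observation: dist = ||chi - X_i||, y = Y_i, h = h_i."""
        w = self.kernel(dist / h) / self.F(h) ** self.ell
        self.num += y * w
        self.den += w
        return self.predict()

    def predict(self):
        """Current value of the estimator (NaN while all weights are zero)."""
        return self.num / self.den if self.den > 0 else np.nan
```

Calling `update` once per new pair gives the same value as recomputing the weighted ratio on the whole sample, which is the property exploited in sequential prediction.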



We will assume that the following assumptions hold.

H1 The operators $r$ and $\sigma_\varepsilon^2$ are continuous on a neighborhood of $\chi$, and
$F(0) = 0$. Moreover, the function $\phi(t) := E\left[\{r(X) - r(\chi)\} \mid \|X - \chi\| = t\right]$
is assumed to be derivable at $t = 0$.

H2 $K$ is a nonnegative bounded kernel with support on the compact $[0, 1]$
such that $\inf_{t \in [0,1]} K(t) > 0$.

H3 For any $s \in [0, 1]$, $\tau_h(s) := \frac{F(hs)}{F(h)} \to \tau_0(s) < \infty$ as $h \to 0$.

H4 (i) $h_n \to 0$, $nF(h_n) \to \infty$ and
$A_{n,\ell} := \frac{1}{n} \sum_{i=1}^n \frac{h_i}{h_n} \left[\frac{F(h_i)}{F(h_n)}\right]^{1-\ell} \to \alpha_{[\ell]} < \infty$ as $n \to \infty$.

(ii) $\forall r \le 2$, $B_{n,r} := \frac{1}{n} \sum_{i=1}^n \left[\frac{F(h_i)}{F(h_n)}\right]^{r} \to \beta_{[r]} < \infty$, as $n \to \infty$.



Assumptions H1, H2 and the first part of H4 are classical in nonparametric
regression. They have been used by Ferraty et al. (2007) and are the same
as those classically used in the finite-dimensional setting. The conditions
$A_{n,\ell} \to \alpha_{[\ell]} < \infty$ and H4(ii) are particular to the recursive problem and
they are also the same as the ones used in the finite-dimensional case. Note
that $F$ plays a crucial role in our calculations; its behavior near zero, for a fixed $\chi$,
is known as the 'small ball' probability. Before stating our results, let
us give typical examples of bandwidths and small ball probabilities satisfying
H3 and H4 (see Ferraty et al. (2007) for more details).
If $X$ is a fractal (or geometric) process, then the small ball probabilities are
of the form $F(t) \sim c'_\chi t^\kappa$, where $c'_\chi$ and $\kappa$ are positive constants, and $\|\cdot\|$
may be a supremum norm, an $L^p$ norm or a Besov norm. In this case, the
choice of bandwidth $h_n = A n^{-\delta}$ with $A > 0$ and $0 < \delta < 1$ implies that
$F(h_n) \sim c''_\chi n^{-\delta\kappa}$, $c''_\chi > 0$. Then, H3 and H4 hold as soon as $\delta\kappa < 1$. Indeed,
assumption H3 and the first part of H4 are clearly unrestrictive, since they
are the same as those used in the non-recursive case. Concerning H4(ii), if
$\delta\kappa r < 1$, then $\sum_{i=1}^n i^{-\delta\kappa r} \sim \frac{n^{1-\delta\kappa r}}{1-\delta\kappa r}$, so that the condition is satisfied with
$\beta_{[r]} = \frac{1}{1-\delta\kappa r}$. The same argument is also valid for $A_{n,\ell}$, if
$\max\{\kappa r, 1 + \kappa(1 - \ell)\} < 1/\delta$.
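The limits in H4 can be checked numerically in the fractal case. The sketch below assumes the small ball probability is exactly $F(t) = c\,t^\kappa$ and $h_i = A i^{-\delta}$, so that the constants $c$ and $A$ cancel in the ratios; the function names `A_n` and `B_n` are introduced here for illustration only.

```python
import numpy as np


def B_n(n, r, delta, kappa):
    """B_{n,r} = (1/n) sum_i [F(h_i)/F(h_n)]^r with F(t) = c t^kappa, h_i = A i^(-delta)."""
    i = np.arange(1, n + 1, dtype=float)
    return np.mean((i / n) ** (-delta * kappa * r))


def A_n(n, ell, delta, kappa):
    """A_{n,ell} = (1/n) sum_i (h_i/h_n) [F(h_i)/F(h_n)]^(1-ell) under the same assumptions."""
    i = np.arange(1, n + 1, dtype=float)
    return np.mean((i / n) ** (-delta * (1.0 + kappa * (1.0 - ell))))


if __name__ == "__main__":
    delta, kappa = 0.2, 1.0
    # Candidate limits: 1/(1 - delta*kappa*r) and 1/(1 - delta*(1 + kappa*(1 - ell)))
    print(B_n(10**6, 1.0, delta, kappa), 1.0 / (1.0 - delta * kappa))
    print(A_n(10**6, 0.0, delta, kappa), 1.0 / (1.0 - delta * (1.0 + kappa)))
```

For large $n$ both averages should be close to the stated limits whenever the exponents satisfy the conditions above.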

2.2 Main results 

As in Ferraty et al. (2007), let us introduce the following notations:

$$M_0 = K(1) - \int_0^1 (sK(s))' \tau_0(s)\,ds, \qquad M_1 = K(1) - \int_0^1 K'(s) \tau_0(s)\,ds,$$

$$M_2 = K^2(1) - \int_0^1 (K^2(s))' \tau_0(s)\,ds.$$
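As an illustration of these constants, the sketch below evaluates $M_0$, $M_1$ and $M_2$ by a midpoint rule under the assumption $\tau_0(s) = s^\kappa$, which is what the fractal small ball form $F(t) \sim c'_\chi t^\kappa$ gives. The helper name `constants_M` and the choice of quadrature are hypothetical. For the uniform kernel on $[0, 1]$ the integrals reduce to $M_0 = \kappa/(\kappa+1)$ and $M_1 = M_2 = 1$.

```python
import numpy as np


def constants_M(K, dK, kappa, m=200_000):
    """Midpoint-rule values of (M0, M1, M2), assuming tau_0(s) = s**kappa.

    K and dK are the kernel and its derivative on [0, 1].
    """
    s = (np.arange(m) + 0.5) / m                  # midpoints of a uniform grid on [0, 1]
    tau0 = s ** kappa
    M0 = K(1.0) - np.mean((K(s) + s * dK(s)) * tau0)      # (sK(s))' = K + sK'
    M1 = K(1.0) - np.mean(dK(s) * tau0)
    M2 = K(1.0) ** 2 - np.mean(2.0 * K(s) * dK(s) * tau0)  # (K^2)' = 2KK'
    return float(M0), float(M1), float(M2)


if __name__ == "__main__":
    uniform = lambda s: 1.0 + 0.0 * s
    d_uniform = lambda s: 0.0 * s
    print(constants_M(uniform, d_uniform, kappa=2.0))
```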



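Now, we can establish the asymptotic mean square error of our recursive
estimate.

Theorem 1 Under the assumptions H1-H4,

$$E\left[r_n^{[\ell]}(\chi)\right] - r(\chi) = \phi'(0) \frac{\alpha_{[\ell]}}{\beta_{[1-\ell]}} \frac{M_0}{M_1} h_n [1 + o(1)] + O\left(\frac{1}{nF(h_n)}\right),$$

$$\mathrm{Var}\left[r_n^{[\ell]}(\chi)\right] = \frac{\beta_{[1-2\ell]}}{\beta_{[1-\ell]}^2} \frac{M_2}{M_1^2} \sigma_\varepsilon^2(\chi) \frac{1}{nF(h_n)} [1 + o(1)].$$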



Theorem 1 is an extension of the work of Ferraty et al. (2007) to the class of
recursive estimators. Using the bias-variance representation, with the help of
an additional condition, the asymptotic mean square error of our estimators
is established in the following result.

Corollary 1 Assume that the assumptions of Theorem 1 hold. If there exists
a constant $c > 0$ such that $nF(h_n)h_n^2 \to c$, as $n \to \infty$, then

$$\lim_{n \to \infty} nF(h_n) E\left[r_n^{[\ell]}(\chi) - r(\chi)\right]^2 = \frac{\beta_{[1-2\ell]} M_2 \sigma_\varepsilon^2(\chi)}{\beta_{[1-\ell]}^2 M_1^2} + \frac{c\,\alpha_{[\ell]}^2 \phi'(0)^2 M_0^2}{\beta_{[1-\ell]}^2 M_1^2}.$$

In particular, if $X$ is a fractal (or geometric) process with $F(t) \sim c'_\chi t^\kappa$, then
the choice $h_n = A n^{-\frac{1}{\kappa+2}}$, $A, \kappa > 0$, gives $nF(h_n)h_n^2 \to c'_\chi A^{\kappa+2}$ and implies that

$$\lim_{n \to \infty} n^{\frac{2}{2+\kappa}} E\left[r_n^{[\ell]}(\chi) - r(\chi)\right]^2 = \frac{\beta_{[1-2\ell]} M_2 \sigma_\varepsilon^2(\chi)}{\beta_{[1-\ell]}^2 M_1^2 c'_\chi A^\kappa} + \frac{A^2 \alpha_{[\ell]}^2 \phi'(0)^2 M_0^2}{\beta_{[1-\ell]}^2 M_1^2}.$$

In the finite-dimensional setting and for continuous time processes, a similar
result was established by Bosq and Cheze-Payaud (1999) for the
Nadaraya-Watson estimator.

To get the almost sure convergence rate of our estimator, we will assume
that the following additional assumptions hold.

H5 There exist $\lambda > 0$ and $\mu > 0$ such that $E\left[\exp\left(\lambda |Y|^\mu\right)\right] < \infty$.



H6 lim W - F ( fe ")( ln ") M = oo f or some a > and lim (Inn) v F(hn) 
0. 



Assumption H5 is clearly satisfied if $Y$ is bounded and implies that

$$E\left(\max_{1 \le i \le n} |Y_i|^p\right) = O\left[(\ln n)^{p/\mu}\right], \quad \forall p \ge 1,\ n \ge 2. \tag{4}$$



Indeed, if we set M 



if p > \i 
else 



one may write:

$$E\left(\max_{1 \le i \le n} |Y_i|^p\right) \le M^p + E\left(\max_{1 \le i \le n} |Y_i|^p 1_{\{|Y_i| > M\}}\right).$$

Since for all $p \ge 1$, the function $x \mapsto (\ln x)^{p/\mu}$ is concave on the
set $]\max\{1, \exp(p/\mu - 1)\}, +\infty[$, Jensen's inequality, with the help of
assumption H5, implies that

$$E\left(\max_{1 \le i \le n} |Y_i|^p 1_{\{|Y_i| > M\}}\right) \le \lambda^{-p/\mu} \left[\ln E \exp\left(\lambda \max_{1 \le i \le n} |Y_i|^\mu 1_{\{|Y_i| > M\}}\right)\right]^{p/\mu} \le \lambda^{-p/\mu} \left[\ln \sum_{i=1}^n E \exp\left(\lambda |Y_i|^\mu\right)\right]^{p/\mu},$$

and (4) follows. An example of a sequence of random variables $Y_i$ satisfying H5
(and then (4)) is the standard Gaussian, with $\mu = 2$ and any $0 < \lambda < 1/2$. Relation (4)
has been used in the multivariate framework by Bosq and Cheze-Payaud (1999)
in order to establish the optimal quadratic error of the Nadaraya-Watson
estimator. Assumption H6 is satisfied as soon as $X$ is fractal or non-smooth,
while the condition $\lim_{n \to \infty} F(h_n)(\ln n)^{2/\mu} = 0$ is not necessary when $\mu > 2$.

Now, we can write the following theorem for our estimator of the regression
operator.

Theorem 2 Assume that H1-H6 hold. If $\lim_{n \to +\infty} nh_n^2 = 0$, then

$$\limsup_{n \to \infty} \left[\frac{nF(h_n)}{\ln \ln n}\right]^{1/2} \left[r_n^{[\ell]}(\chi) - r(\chi)\right] = \frac{\left[2\beta_{[1-2\ell]} \sigma_\varepsilon^2(\chi) M_2\right]^{1/2}}{\beta_{[1-\ell]} M_1} \quad \text{a.s.}$$

The choices of bandwidths and small ball probabilities given previously
are typical examples satisfying the condition $\lim_{n \to +\infty} nh_n^2 = 0$. The case $\ell = 1$
of Theorem 2 is an extension to the functional setting of the result of
Roussas (1992) concerning the almost sure convergence of Devroye-Wagner's
estimator. Note that in the non-recursive framework, the rate of convergence
obtained is of the form $\left[\frac{nF(h_n)}{\ln n}\right]^{1/2}$ (see Lemma 6.3 in Ferraty and Vieu (2006)).
Also, contrary to the non-recursive case, the rates of convergence of the
recursive estimators are obtained with exact upper bounds.



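To get the asymptotic normality, we will make the following additional
assumption, which is clearly satisfied by the choices of bandwidths and small
ball probabilities given above.

H7 For any $\delta > 0$, $\lim_{n \to +\infty} \frac{(\ln n)^\delta}{\sqrt{nF(h_n)}} = 0$.

Theorem 3 Assume that H1-H5 and H7 hold. If there exists $c > 0$ such
that $\lim_{n \to \infty} h_n \sqrt{nF(h_n)} = c$, then

$$\sqrt{nF(h_n)} \left(r_n^{[\ell]}(\chi) - r(\chi)\right) \xrightarrow{\ \mathcal{D}\ } \mathcal{N}\left(c\,\frac{\alpha_{[\ell]}}{\beta_{[1-\ell]}} \frac{M_0}{M_1} \phi'(0),\ \frac{\beta_{[1-2\ell]}}{\beta_{[1-\ell]}^2} \frac{M_2}{M_1^2} \sigma_\varepsilon^2(\chi)\right).$$

This result is similar to the one obtained by Ferraty et al. (2007) in the
non-recursive case. Let us mention that the choices of bandwidths and small
ball probabilities given above imply that $\frac{\beta_{[1-2\ell]}}{\beta_{[1-\ell]}^2} < 1$. Then, the recursive
estimators are more efficient than classical estimators, in the sense that their
asymptotic variance is smaller.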
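The inequality on the variance factor can be verified by hand in the fractal case. The following worked computation is a sketch that assumes the values $\beta_{[r]} = 1/(1 - \delta\kappa r)$ from the fractal example of Section 2.1, and uses the shorthand $a := \delta\kappa \in (0, 1)$, introduced here only for convenience.

```latex
\[
\frac{\beta_{[1-2\ell]}}{\beta_{[1-\ell]}^{2}}
  = \frac{\bigl(1 - a(1-\ell)\bigr)^{2}}{1 - a(1-2\ell)},
\qquad
1 - \frac{\beta_{[1-2\ell]}}{\beta_{[1-\ell]}^{2}}
  = \frac{a\,\bigl[\,1 - a(1-\ell)^{2}\bigr]}{1 - a(1-2\ell)} > 0
  \quad \text{for } 0 < a < 1,\ \ell \in [0,1].
\]
```

Both the numerator and the denominator of the last fraction are positive because $a < 1$, $(1-\ell)^2 \le 1$ and $|1 - 2\ell| \le 1$. For $\ell = 0$ the factor equals $1 - a$.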

In practice, if we need to construct confidence bands for the regression
function $r$, the constants involved in Theorem 3 need to be estimated. In
particular, as mentioned in Ferraty et al. (2007), if we choose the simple uniform
kernel, we can find explicit values of the constants $M_1$ and $M_2$. The
conditional variance $\sigma_\varepsilon^2(\chi)$ may be estimated by means of the functional kernel
regression technique since it can be rewritten as

$$\sigma_\varepsilon^2(\chi) = E\left(Y^2 \mid X = \chi\right) - \left(E(Y \mid X = \chi)\right)^2.$$



3 Simulation study and real dataset example 



In order to see the behavior of our recursive estimator in practice, we
consider in this section a simulation study. We simulate our data in the following
way. The curves $X_1, \dots, X_n$ are standard Brownian motions on $[0, 1]$, with
$n = 100$. Each curve is discretized into $p = 100$ equidistant points on $[0, 1]$.
The operator $r$ is defined by $r(\chi) = \int_0^1 \chi(s)^2\,ds$. The error $\varepsilon$ is simulated
as a Gaussian random variable with mean 0 and standard deviation 0.1. The
simulations are repeated 500 times in order to compute the prediction errors
for a new curve $\chi$, also simulated as a standard Brownian motion on $[0, 1]$.

In our functional context, our estimator depends on the choice of many
parameters: the semi-norm $\|\cdot\|$ of the functional space $\mathcal{E}$, the sequence of
bandwidths $(h_n)$, the kernel $K$, the parameter $\ell$ and the distribution function
$F$ in the case $\ell \ne 0$. Since the choice of the kernel $K$ is not crucial, we use the
quadratic kernel, defined by $K(u) = (1 - u^2) 1_{[0,1]}(u)$ for all $u \in \mathbb{R}$, which
is known to behave correctly in practice and is easy to implement. As for the
distribution function $F$, we estimate it by the empirical distribution function,
which is known to be uniformly convergent.
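The data-generating design described in this section can be sketched as follows. This is an illustrative reconstruction rather than the authors' code: the function name `simulate_sample` is hypothetical, the grid is assumed to include both endpoints of $[0, 1]$, and the integral defining $r$ is approximated by the trapezoidal rule.

```python
import numpy as np


def simulate_sample(n=100, p=100, noise_sd=0.1, rng=None):
    """Simulate n Brownian curves on a p-point grid of [0, 1] and responses Y = r(X) + eps."""
    rng = np.random.default_rng() if rng is None else rng
    t = np.linspace(0.0, 1.0, p)
    # Independent Gaussian increments with variance equal to the grid step; X(0) = 0.
    steps = rng.normal(scale=np.sqrt(1.0 / (p - 1)), size=(n, p - 1))
    X = np.concatenate([np.zeros((n, 1)), np.cumsum(steps, axis=1)], axis=1)
    # r(chi) = integral of chi(s)^2 over [0, 1], trapezoidal approximation.
    sq = X ** 2
    r = np.sum(0.5 * (sq[:, 1:] + sq[:, :-1]) * np.diff(t), axis=1)
    Y = r + rng.normal(scale=noise_sd, size=n)
    return X, Y, t
```

A new curve for prediction can be drawn with the same function using `n=1`.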



3.1 Choice of the bandwidth 

In this simulation, the semi-norm is based on the principal components
analysis of the curves, keeping 3 principal components (see Besse et al. (1997)
for a description of this semi-norm), while $\ell$ is fixed equal to 0. We will
see below that this parameter $\ell$ has little influence on the behavior of the
estimator.

We choose to take a sequence of bandwidths $h_i = C \max_{i=1,\dots,n} \|X_i - \chi\|\, i^{-\nu}$,
for $i = 1, \dots, n$, with $C \in \{0.5, 1, 2, 10\}$ and $\nu \in \{\dots, 1\}$.
At the same time, we also compute the estimator (2) introduced by Ferraty and Vieu (2006).

Following Rachdi and Vieu (2007), we introduce an automatic selection of
the bandwidth, with a cross validation procedure. We use this procedure for
the estimator of Ferraty and Vieu (2006). For our recursive estimator, we
denote $h_i = h_i(C, \nu)$, with $C$ and $\nu$ ranging over the same grids,
and we consider the cross validation criterion

$$CV(C, \nu) = \frac{1}{n} \sum_{i=1}^n \left(Y_i - r_n^{[\ell],[-i]}(X_i)\right)^2,$$

where $r_n^{[\ell],[-i]}$ represents the recursive estimator of $r$ using the $(n-1)$-sample
corresponding to the initial sample without the $i$th observation $(X_i, Y_i)$, for
$i = 1, \dots, n$. Then we select the values $C_{CV}$ and $\nu_{CV}$ of $C$ and $\nu$ that
minimize $CV(C, \nu)$, and our estimator is $r_n^{[\ell]}$ using these selected values of $C$ and
$\nu$.
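The cross validation step can be sketched as below for the case $\ell = 0$ used in this subsection, where the distribution function $F$ drops out of the estimator. All function names are hypothetical, the input is assumed to be a precomputed matrix `D` of semi-norm distances between curves, the grid of $\nu$ values in the usage lines is a placeholder, and observations are re-indexed from 1 after one is left out (an assumption, since the bandwidth $h_i$ depends on the index).

```python
import numpy as np


def quadratic_kernel(u):
    """K(u) = (1 - u^2) on [0, 1], zero elsewhere."""
    return np.where((u >= 0) & (u <= 1), 1.0 - u ** 2, 0.0)


def predict_ell0(dists, y, C, nu):
    """Estimator with ell = 0 at a curve whose distances to X_1..X_n are `dists`."""
    i = np.arange(1, len(y) + 1, dtype=float)
    h = C * dists.max() * i ** (-nu)          # h_i = C max_i ||X_i - chi|| i^(-nu)
    w = quadratic_kernel(dists / h)
    total = np.sum(w)
    return np.sum(w * y) / total if total > 0 else np.nan


def cv_score(D, y, C, nu):
    """Leave-one-out criterion CV(C, nu); infinite if some prediction is undefined."""
    n = len(y)
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        pred = predict_ell0(D[i, keep], y[keep], C, nu)
        if not np.isfinite(pred):
            return np.inf
        errors[i] = (y[i] - pred) ** 2
    return float(np.mean(errors))


def select_bandwidth(D, y, C_grid, nu_grid):
    """Return the pair (C, nu) minimising the leave-one-out criterion."""
    scores = {(C, nu): cv_score(D, y, C, nu) for C in C_grid for nu in nu_grid}
    return min(scores, key=scores.get)
```

A call such as `select_bandwidth(D, y, [0.5, 1, 2, 10], [0.1, 0.5, 1.0])` returns the selected pair on that placeholder grid.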



Table 1 presents the mean and standard deviations of the prediction error
over 500 repeated simulations, for the optimal values of $C$ and $\nu$ with respect
to the CV criterion (these optimal values are $C_{CV} = 1$ and $\nu_{CV} = 1/10$ for
our estimator). More precisely, denoting $\hat{Y}^{[j]} = r_n^{[\ell]}(\chi^{[j]})$ the predicted
value at the $j$th iteration of the simulation ($j = 1, \dots, 500$) for a new curve
$\chi^{[j]}$, we give the mean (MSPE) and the standard deviations of the quantities
$(\hat{Y}^{[j]} - Y^{[j]})^2$. The errors are computed for our estimator (label (1) in
the table) and the estimator from Ferraty and Vieu (2006) (label (2) in the
table), both tuned with the Rachdi and Vieu (2007) procedure. These results
show that the estimator from Ferraty and Vieu (2006) is slightly
better than our estimator for the MSPE criterion. As we will see later (see
subsection 3.4), the advantage of our estimator lies in its
computational time. We also look at the behaviour of the prediction errors
when the sample size increases: we took $n = 100$, $n = 200$ and $n = 500$. As
expected, the errors decrease when the sample size increases.

          n = 100     n = 200     n = 500
(1)       0.3022      0.2596      0.1993
         (0.6887)    (0.6275)    (0.5430)
(2)       0.2794      0.2143      0.1368
         (0.5512)    (0.5055)    (0.4208)

Table 1: Mean and standard deviation (in parentheses) of the square prediction
error, computed on 500 repeated simulations, for different values of $n$, with the
optimal values of bandwidth given by $C_{CV}$ and $\nu_{CV}$.



3.2 Choice of the semi-norm 

In this simulation, the parameter $\ell$ is fixed equal to 0 and we choose to take
a bandwidth $h_i = \max_{i=1,\dots,n} \|X_i - \chi\|\, i^{-1/10}$. The aim is now to compare the
influence of the choice of the semi-norm, considering the following ones:

• the semi-norm [PCA] based on the principal components analysis of
the curves, keeping $q = 3$ principal components, more precisely

$$\|X_i - \chi\|_{PCA} = \sqrt{\sum_{j=1}^q \langle X_i - \chi, v_j \rangle^2},$$

where $\langle \cdot, \cdot \rangle$ is the usual inner product of the space of square integrable
functions and $(v_j)$ is the sequence of eigenfunctions of the empirical covariance
operator $\Gamma_n$ defined by $\Gamma_n u := \frac{1}{n} \sum_{i=1}^n \langle X_i, u \rangle X_i$,

• the semi-norm [FOU] based on a decomposition of the curves in a
Fourier basis, with $b = 8$ basis functions, more precisely

$$\|X_i - \chi\|_{FOU} = \sqrt{\sum_{j=1}^b \left(a_{X_i,j} - a_{\chi,j}\right)^2},$$

where $(a_{X_i,j})$ and $(a_{\chi,j})$ are the coefficient sequences of the respective Fourier
approximations of the curves $X_i$ and $\chi$,

• the semi-norm [DERIV] based on a comparison of cubic spline
approximations of the second derivatives of the curves (with a number of
interior knots $k = 8$ for the cubic splines), more precisely

$$\|X_i - \chi\|_{DERIV} = \sqrt{\left\langle \tilde{X}_i^{(2)} - \tilde{\chi}^{(2)}, \tilde{X}_i^{(2)} - \tilde{\chi}^{(2)} \right\rangle},$$

where $\tilde{X}_i$ and $\tilde{\chi}$ are the spline approximations of the curves $X_i$ and $\chi$,

• the semi-norm [PLS] where the data are projected on a space
determined by a PLS regression on the curves, taking $K = 5$ PLS basis functions,
more precisely

$$\|X_i - \chi\|_{PLS} = \sqrt{\sum_{j=1}^K \langle X_i - \chi, p_j \rangle^2},$$

where $(p_j)$ is the sequence of PLS basis functions.
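As an illustration of the first of these semi-norms, the sketch below computes PCA-based distances for curves observed on a common equispaced grid. It is a simplified version under stated assumptions: the function name `pca_seminorm_distances` is hypothetical, the inner product is approximated by a grid average, and the empirical covariance operator is not centred, following the definition of $\Gamma_n$ given above.

```python
import numpy as np


def pca_seminorm_distances(X, chi, q=3):
    """Distances ||X_i - chi||_PCA for curves discretised on a common grid.

    X has shape (n, p); chi has shape (p,).
    The inner product <u, v> is approximated by (1/p) * sum_k u_k v_k.
    """
    n, p = X.shape
    # Discretised covariance operator: Gamma_n u = (1/n) sum_i <X_i, u> X_i.
    G = X.T @ X / (n * p)
    eigval, eigvec = np.linalg.eigh(G)          # eigenvalues in ascending order
    # Keep the q leading eigenvectors and rescale them to unit L2 norm on the grid.
    V = eigvec[:, ::-1][:, :q] * np.sqrt(p)
    scores = (X - chi) @ V / p                  # <X_i - chi, v_j>
    return np.sqrt(np.sum(scores ** 2, axis=1))
```

By Bessel's inequality the result never exceeds the discretised $L^2$ distance, and it coincides with it when all $p$ components are kept.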

The results are given in Table 2. For these simulated data, the semi-norms
[PCA] and [PLS] show better results. However, as pointed out in
Ferraty and Vieu (2006), there is no universal norm that would outperform
the others. The choice of the semi-norm depends on the data to be treated.

norm      [PCA]      [FOU]      [DERIV]    [PLS]
MSPE      0.3936     0.4506     0.4527     0.3887
         (1.5190)   (1.5624)   (1.5616)   (1.5098)

Table 2: Mean and standard deviation (in parentheses) of the square prediction
error, computed on 500 repeated simulations, for different choices of norms.



3.3 Choice of the parameter $\ell$

In this simulation, we choose to take $h_i = \max_{i=1,\dots,n} \|X_i - \chi\|\, i^{-1/10}$ and the
semi-norm based on the principal components analysis of the curves,
keeping 3 principal components. The parameter $\ell$ varies in $\{0, 1/4, 1/2, 3/4, 1\}$.
The results are given in Table 3. We can see that the values of the MSPE
criterion are really close, so this parameter does not seem to have an
important influence on the quality of the prediction, even if we observe, as in the
multivariate setting, that the mean square error decreases as $\ell$ increases.

ℓ        0            0.25         0.5          0.75         1
MSPE     0.4054848    0.4054814    0.4054786    0.4054764    0.4054746
        (1.372965)   (1.372930)   (1.372896)   (1.372863)   (1.372831)

Table 3: Mean and standard deviation (in parentheses) of the square prediction
error, computed on 500 repeated simulations, for different values of $\ell$.



3.4 Computational time 

In this subsection, we highlight an important advantage of the recursive
estimator compared to the initial one from Ferraty and Vieu (2006). This
concerns the gain of computational time in the prediction of the response,
when new values of the explanatory variable are sequentially added to the
database. Indeed, when a new observation $(X_{n+1}, Y_{n+1})$ appears, the
computation of the recursive estimator $r_{n+1}^{[\ell]}$ requires just one more iteration of the
algorithm, using its value computed with the sample $(X_i, Y_i)_{i=1,\dots,n}$,
while the initial estimator from Ferraty and Vieu (2006) must be
recomputed on the whole sample $(X_i, Y_i)_{i=1,\dots,n+1}$. This explains the difference in
computation time between both estimators in this kind of situation, as
illustrated in the following. From an initial sample $(X_i, Y_i)_{i=1,\dots,n}$ with size
$n = 100$, we consider $N$ additional observations, for different values of $N$.
We compare the cumulated computational times to obtain the recursive and
the non-recursive estimators, when adding these $N$ new observations. The
characteristics of the computer on which the computations have been done
are: CPU: Duo E4700 2.60 GHz, HD: 149 GB, Memory: 3.23 GB. The
simulation is done in the following conditions: the curves $X_1, \dots, X_n$, as well
as the new observations $X_{n+1}, \dots, X_{n+N}$, are standard Brownian motions
on $[0, 1]$, with $n = 100$ and $N \in \{1, 50, 100, 200, 500\}$. The semi-norm, the
sequence of bandwidths and the parameter $\ell$ are chosen as in the
previous cases.

The computational times are collected in Table 4. Here, our estimator
shows its clear advantage in terms of computational time compared to the
estimator from Ferraty and Vieu (2006).

N                                                            1       50      100     200      500
comp. time for $r_{n+1}^{[\ell]}, \dots, r_{n+N}^{[\ell]}$   0.125   0.484   0.859   1.563    3.656
comp. time for $r_{n+1}, \dots, r_{n+N}$                     0.047   1.922   5.594   21.938   152.719

Table 4: Cumulated computational times in seconds for the recursive and
Ferraty and Vieu (2006) estimators when adding $N$ new observations, for
different values of $N$.

3.5 A real dataset example

In this subsection, we use our estimator on a real dataset.
Functional data are particularly well suited when one wants to study a time
series. We illustrate this fact with the El Nino time series, which gives the
monthly sea surface temperature from January 1982 up to December 2011
(360 months) and is plotted in Figure 1. From this time series, we extract
the 30 yearly curves $X_1, \dots, X_{30}$ from 1982 to 2011, discretized into $p = 12$
points. These yearly curves are plotted in Figure 2. The observation of the
variable of interest at a certain month $j$ of the year $i$ is the value of the sea
temperature $X_{i+1}$ for the month $j$; in other words, for $j = 1, \dots, 12$ and for
$i = 1, \dots, 29$, $Y_i^{(j)} = X_{i+1}(j)$.
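The construction of the yearly curves and of the responses can be sketched as follows. The function name `yearly_curves` is hypothetical and the input is any array of monthly values whose length is a multiple of 12; the actual El Nino data are not reproduced here.

```python
import numpy as np


def yearly_curves(series, months_per_year=12):
    """Cut a monthly series into yearly curves and pair each year with the next one."""
    series = np.asarray(series, dtype=float)
    curves = series.reshape(-1, months_per_year)   # row i-1 holds the curve X_i
    X = curves[:-1]                                # covariates X_1, ..., X_{T-1}
    Y = curves[1:]                                 # Y[i-1, j-1] = X_{i+1}(j)
    return curves, X, Y
```

With a 360-month series this gives 30 curves of 12 points and 29 covariate-response pairs, one response per month.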




Figure 1: El Nino monthly temperature time series from January, 1982 up 
to December, 2011. 

We predict the values of $Y_{29}^{(1)}, \dots, Y_{29}^{(12)}$ (in other words, the values of the
curve $X_{30}$). The recursive estimator and the estimator from Ferraty and Vieu (2006)
are computed by choosing the semi-norm, the sequence of bandwidths and
the parameter $\ell$ as in the previous cases.



Available online at http://www.math.univ-toulouse.fr/staph/npfda/ 






Figure 2: El Nino yearly curves temperatures from 1982 up to 2011. 

We analyze the results by computing the mean square prediction error
over the year 2011, given by

$$\mathrm{mspe} = \frac{1}{12} \sum_{j=1}^{12} \left(\hat{Y}_{29}^{(j)} - Y_{29}^{(j)}\right)^2,$$

where $\hat{Y}_{29}^{(j)}$ is computed either with the recursive estimator (result: 0.5719)
or the estimator from Ferraty and Vieu (2006) (result: 0.2823). The
corresponding true curve and predicted curves over the year 2011 are plotted in
Figure 3. The estimator from Ferraty and Vieu (2006) again shows its
advantage in terms of prediction, while our estimator behaves quite well and has
the advantage of computational time, as highlighted in the previous subsection.
Here, for the prediction of twelve values (the final year), the computational
time (in seconds) for our estimator is 0.128, while the computational time for
the estimator from Ferraty and Vieu (2006) is 0.487.



4 Proofs 

Throughout the proofs, we denote by $\gamma_i$ a sequence of real numbers going to
zero as $i$ tends to $\infty$. The kernel estimate $r_n^{[\ell]}$ can be written as

$$r_n^{[\ell]}(\chi) = \frac{\varphi_n^{[\ell]}(\chi)}{f_n^{[\ell]}(\chi)},$$

where $\varphi_n^{[\ell]}$ and $f_n^{[\ell]}$ are defined in (3).

Figure 3: El Nino true and predicted temperature curves for the year 2011.
The solid line is the true curve. The dashed line is the predicted curve with
the recursive estimator. The dotted line is the predicted curve with the
estimator from Ferraty and Vieu (2006).
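4.1 Proof of Theorem 1

To prove the first assertion of Theorem 1, we use the following decomposition

$$E\left[r_n^{[\ell]}(\chi)\right] - r(\chi) = \left\{\frac{E\left[\varphi_n^{[\ell]}(\chi)\right]}{E\left[f_n^{[\ell]}(\chi)\right]} - r(\chi)\right\} - \frac{E\left\{\varphi_n^{[\ell]}(\chi)\left[f_n^{[\ell]}(\chi) - E f_n^{[\ell]}(\chi)\right]\right\}}{\left\{E\left[f_n^{[\ell]}(\chi)\right]\right\}^2} + \frac{E\left\{r_n^{[\ell]}(\chi)\left[f_n^{[\ell]}(\chi) - E f_n^{[\ell]}(\chi)\right]^2\right\}}{\left\{E\left[f_n^{[\ell]}(\chi)\right]\right\}^2}.$$

The first part of Theorem 1 is then a direct consequence of the following
lemmas.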

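Lemma 1 Under assumptions H1-H4, we have

$$\frac{E\left[\varphi_n^{[\ell]}(\chi)\right]}{E\left[f_n^{[\ell]}(\chi)\right]} - r(\chi) = h_n \phi'(0) \frac{\alpha_{[\ell]}}{\beta_{[1-\ell]}} \frac{M_0}{M_1} [1 + o(1)].$$

Lemma 2 Under assumptions H1-H4, we have

$$E\left\{\varphi_n^{[\ell]}(\chi)\left[f_n^{[\ell]}(\chi) - E f_n^{[\ell]}(\chi)\right]\right\} = O\left(\frac{1}{nF(h_n)}\right), \qquad E\left\{r_n^{[\ell]}(\chi)\left[f_n^{[\ell]}(\chi) - E f_n^{[\ell]}(\chi)\right]^2\right\} = O\left(\frac{1}{nF(h_n)}\right).$$

Lemma 3 Under assumptions H1-H4, we have

$$E\left(f_n^{[\ell]}(\chi)\right) = M_1 [1 + o(1)] \quad \text{and} \quad E\left(\varphi_n^{[\ell]}(\chi)\right) = r(\chi) M_1 [1 + o(1)].$$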



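To study the variance term in Theorem 1, we use the following decomposition,
which can be found in Collomb (1976):

$$\mathrm{Var}\left[r_n^{[\ell]}(\chi)\right] = \frac{\mathrm{Var}\left[\varphi_n^{[\ell]}(\chi)\right]}{\left\{E\left[f_n^{[\ell]}(\chi)\right]\right\}^2} - 4\,\frac{E\left[\varphi_n^{[\ell]}(\chi)\right] \mathrm{Cov}\left[f_n^{[\ell]}(\chi), \varphi_n^{[\ell]}(\chi)\right]}{\left\{E\left[f_n^{[\ell]}(\chi)\right]\right\}^3} + 3\,\mathrm{Var}\left[f_n^{[\ell]}(\chi)\right] \frac{\left\{E\left[\varphi_n^{[\ell]}(\chi)\right]\right\}^2}{\left\{E\left[f_n^{[\ell]}(\chi)\right]\right\}^4} + o\left(\frac{1}{nF(h_n)}\right). \tag{5}$$

The second assertion of Theorem 1 follows from (5) and Lemma 4 below. □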



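Lemma 4 Under assumptions H1-H4, we have

$$\mathrm{Var}\left[f_n^{[\ell]}(\chi)\right] = \frac{\beta_{[1-2\ell]}}{\beta_{[1-\ell]}^2} M_2 \frac{1}{nF(h_n)} [1 + o(1)],$$

$$\mathrm{Var}\left[\varphi_n^{[\ell]}(\chi)\right] = \frac{\beta_{[1-2\ell]}}{\beta_{[1-\ell]}^2} \left[r^2(\chi) + \sigma_\varepsilon^2(\chi)\right] M_2 \frac{1}{nF(h_n)} [1 + o(1)],$$

$$\mathrm{Cov}\left[f_n^{[\ell]}(\chi), \varphi_n^{[\ell]}(\chi)\right] = \frac{\beta_{[1-2\ell]}}{\beta_{[1-\ell]}^2} r(\chi) M_2 \frac{1}{nF(h_n)} [1 + o(1)].$$

Now let us prove Lemmas 1-4.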



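4.1.1 Proof of Lemma 1

Observe that

$$\frac{E\left[\varphi_n^{[\ell]}(\chi)\right]}{E\left[f_n^{[\ell]}(\chi)\right]} - r(\chi) = \frac{\sum_{i=1}^n F(h_i)^{-\ell} E\left[(Y_i - r(\chi)) K\left(\frac{\|\chi - X_i\|}{h_i}\right)\right]}{\sum_{i=1}^n F(h_i)^{-\ell} E\left[K\left(\frac{\|\chi - X_i\|}{h_i}\right)\right]}.$$

Noting that

$$E\left[(Y_i - r(\chi)) K\left(\frac{\|\chi - X_i\|}{h_i}\right)\right] = E\left[(r(X) - r(\chi)) K\left(\frac{\|\chi - X\|}{h_i}\right)\right] = E\left[\phi(\|X - \chi\|) K\left(\frac{\|\chi - X\|}{h_i}\right)\right] = \int_0^1 \phi(h_i t) K(t)\,dP^{\|\chi - X\|/h_i}(t),$$

a Taylor expansion of $\phi$ around 0 ensures that

$$E\left[\phi(\|X - \chi\|) K\left(\frac{\|\chi - X\|}{h_i}\right)\right] = h_i \phi'(0) \int_0^1 t K(t)\,dP^{\|\chi - X\|/h_i}(t) + o(h_i).$$

From the proof of Lemma 2 in Ferraty et al. (2007), it follows from H2 and
Fubini's Theorem that

$$\int_0^1 t K(t)\,dP^{\|\chi - X\|/h_i}(t) = F(h_i)\left[K(1) - \int_0^1 (sK(s))' \tau_{h_i}(s)\,ds\right] \tag{6}$$

and

$$E K\left(\frac{\|\chi - X\|}{h_i}\right) = \int_0^1 K(t)\,dP^{\|\chi - X\|/h_i}(t) = F(h_i)\left[K(1) - \int_0^1 K'(s) \tau_{h_i}(s)\,ds\right]. \tag{7}$$

Combining (6) and (7), we have

$$\frac{E\left[\varphi_n^{[\ell]}(\chi)\right]}{E\left[f_n^{[\ell]}(\chi)\right]} - r(\chi) = \frac{\sum_{i=1}^n F(h_i)^{1-\ell} h_i \phi'(0)\left[K(1) - \int_0^1 (sK(s))' \tau_{h_i}(s)\,ds\right][1 + o(1)]}{\sum_{i=1}^n F(h_i)^{1-\ell}\left[K(1) - \int_0^1 K'(s) \tau_{h_i}(s)\,ds\right]} =: \frac{D_1}{D_2}.$$

By virtue of H3 we get from Toeplitz's lemma (see Masry (1986)) that

$$\frac{D_1}{n h_n F(h_n)^{1-\ell}} = \alpha_{[\ell]} \phi'(0) M_0 [1 + o(1)], \qquad \frac{D_2}{n F(h_n)^{1-\ell}} = \beta_{[1-\ell]} M_1 [1 + o(1)],$$

and Lemma 1 follows. □

4.1.2 Proof of Lemma 3

From (7), we can write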



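$$E\left[f_n^{[\ell]}(\chi)\right] = \frac{\sum_{i=1}^n F(h_i)^{-\ell} E\left[K\left(\frac{\|\chi - X_i\|}{h_i}\right)\right]}{\sum_{i=1}^n F(h_i)^{1-\ell}} = \frac{\sum_{i=1}^n F(h_i)^{1-\ell}\left[K(1) - \int_0^1 K'(s) \tau_{h_i}(s)\,ds\right]}{\sum_{i=1}^n F(h_i)^{1-\ell}} = M_1 [1 + o(1)],$$

where the last equality follows from assumptions H3, H4 and Toeplitz's
lemma. Now, conditioning on $X$, we have

$$E\left[Y_i K\left(\frac{\|\chi - X_i\|}{h_i}\right)\right] = E\left\{[r(X) - r(\chi) + r(\chi)] K\left(\frac{\|\chi - X\|}{h_i}\right)\right\} =: A_i + B_i,$$

where

$$A_i := E\left\{[r(X) - r(\chi)] K\left(\frac{\|\chi - X\|}{h_i}\right)\right\} \le \sup_{\chi' \in B(\chi, h_i)} |r(\chi') - r(\chi)|\; E K\left(\frac{\|\chi - X\|}{h_i}\right)$$

and $B_i := r(\chi) E K\left(\frac{\|\chi - X\|}{h_i}\right)$. Since $r$ is continuous (H1), then

$$E\left[Y_i K\left(\frac{\|\chi - X_i\|}{h_i}\right)\right] = [r(\chi) + \gamma_i]\, E K\left(\frac{\|\chi - X\|}{h_i}\right) = F(h_i) M_1 [r(\chi) + \gamma_i]. \tag{8}$$

We deduce from (8), with the help of assumptions H3 and H4, by applying
again Toeplitz's lemma, that

$$E\left[\varphi_n^{[\ell]}(\chi)\right] = \frac{\sum_{i=1}^n F(h_i)^{-\ell} E\left[Y_i K\left(\frac{\|\chi - X_i\|}{h_i}\right)\right]}{\sum_{i=1}^n F(h_i)^{1-\ell}} = r(\chi) M_1 [1 + o(1)],$$

which proves Lemma 3. □

4.1.3 Proof of Lemma 4

First, notice that as in (7), we have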



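$$E\left[K^2\left(\frac{\|\chi - X\|}{h_i}\right)\right] = F(h_i)\left[K^2(1) - \int_0^1 (K^2)'(s) \tau_{h_i}(s)\,ds\right]. \tag{9}$$

The relation (7) and assumption H3 ensure that the squared mean
$\left\{E K\left(\frac{\|\chi - X\|}{h_i}\right)\right\}^2$ is of order $F(h_i)^2$, hence negligible with respect to (9);
then we get

$$\mathrm{Var}\left[K\left(\frac{\|\chi - X\|}{h_i}\right)\right] = M_2 F(h_i) [1 + \gamma_i].$$

We obtain that

$$\mathrm{Var}\left[f_n^{[\ell]}(\chi)\right] = \frac{\sum_{i=1}^n F(h_i)^{1-2\ell} M_2 [1 + \gamma_i]}{\left[\sum_{i=1}^n F(h_i)^{1-\ell}\right]^2} = \frac{\beta_{[1-2\ell]}}{\beta_{[1-\ell]}^2} \frac{1}{nF(h_n)} M_2 [1 + o(1)],$$

and the first step of Lemma 4 follows. In a similar manner, for the second
one, we write

$$\mathrm{Var}\left[\varphi_n^{[\ell]}(\chi)\right] = \frac{\sum_{i=1}^n F(h_i)^{-2\ell}\, \mathrm{Var}\left[Y_i K\left(\frac{\|\chi - X_i\|}{h_i}\right)\right]}{\left[\sum_{i=1}^n F(h_i)^{1-\ell}\right]^2}.$$

Next, one obtains by conditioning on $X$,

$$E\left[Y_i^2 K^2\left(\frac{\|\chi - X_i\|}{h_i}\right)\right] = E\left[r^2(X) K^2\left(\frac{\|\chi - X\|}{h_i}\right)\right] + E\left[\sigma_\varepsilon^2(X) K^2\left(\frac{\|\chi - X\|}{h_i}\right)\right].$$

Assumption H1 and (9) ensure that

$$E\left[Y_i^2 K^2\left(\frac{\|\chi - X_i\|}{h_i}\right)\right] = \left[r^2(\chi) + \sigma_\varepsilon^2(\chi)\right] E\left[K^2\left(\frac{\|\chi - X\|}{h_i}\right)\right][1 + \gamma_i] = \left[r^2(\chi) + \sigma_\varepsilon^2(\chi)\right] M_2 F(h_i) [1 + \gamma_i],$$

and then from Toeplitz's lemma, with H3 and H4, it follows that

$$\mathrm{Var}\left[\varphi_n^{[\ell]}(\chi)\right] = \frac{\beta_{[1-2\ell]}}{\beta_{[1-\ell]}^2} \left[r^2(\chi) + \sigma_\varepsilon^2(\chi)\right] M_2 \frac{1}{nF(h_n)} [1 + o(1)],$$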

which proves the second assertion of Lemma 4. For the covariance term, this
can be written as



Cov 



Ax), Ax) 



i=l 



E 



n n YiK 

EE- 

i=U=i 



EK 



i=l 



E 

i=l 



lx-^jll 
1
E



l|x-*jl 



Notice that from ([6j and (|8j), we have 

1 



II 



O 



F(hi)» 



1 



nF(/t n ) 
Next, from assumption H1 and conditioning on $X$, we have

$$E\left[Y_i K^2\left(\frac{\|\chi - X_i\|}{h_i}\right)\right] = M_2 F(h_i) [r(\chi) + \gamma_i],$$

it follows that



-2 n 



nF(h n ) f^nF{h n ) 

and the third assertion of Lemma 4 follows again by applying Toeplitz's
lemma. □

4.1.4 Proof of Lemma 2

Lemma 2 is a direct consequence of Lemmas 3 and 4. □

4.2 Proof of Theorem 2

We have the following decomposition



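$$r_n^{[\ell]}(\chi) - r(\chi) = \frac{\tilde{\varphi}_n^{[\ell]}(\chi) - r(\chi) f_n^{[\ell]}(\chi)}{f_n^{[\ell]}(\chi)} + \frac{\varphi_n^{[\ell]}(\chi) - \tilde{\varphi}_n^{[\ell]}(\chi)}{f_n^{[\ell]}(\chi)}, \tag{10}$$

where $\tilde{\varphi}_n^{[\ell]}(\chi)$ is a truncated version of $\varphi_n^{[\ell]}(\chi)$ defined by

$$\tilde{\varphi}_n^{[\ell]}(\chi) = \frac{1}{\sum_{i=1}^n F(h_i)^{1-\ell}} \sum_{i=1}^n \frac{Y_i}{F(h_i)^\ell} 1_{\{|Y_i| \le b_n\}} K\left(\frac{\|\chi - X_i\|}{h_i}\right), \tag{11}$$

$b_n$ being a sequence of real numbers which goes to $+\infty$ as $n \to \infty$. Next, for
any $\varepsilon > 0$, we have for the residual term of (10)

$$P\left\{\left|\varphi_n^{[\ell]}(\chi) - \tilde{\varphi}_n^{[\ell]}(\chi)\right| > \varepsilon \left[\frac{\ln \ln n}{nF(h_n)}\right]^{1/2}\right\} \le P\left(\bigcup_{i=1}^n \{|Y_i| > b_n\}\right) \le E\left(e^{\lambda |Y|^\mu}\right) n^{1 - \lambda\delta},$$

where the last inequality follows by setting $b_n = (\delta \ln n)^{1/\mu}$, with the help of
Markov's inequality. Assumption H5 ensures that for any $\varepsilon > 0$,

$$\sum_{n=1}^\infty P\left\{\left|\varphi_n^{[\ell]}(\chi) - \tilde{\varphi}_n^{[\ell]}(\chi)\right| > \varepsilon \left[\frac{\ln \ln n}{nF(h_n)}\right]^{1/2}\right\} < \infty \quad \text{if } \delta > \frac{2}{\lambda},$$

and by Borel-Cantelli's lemma we get

$$\left[\frac{nF(h_n)}{\ln \ln n}\right]^{1/2} \left|\varphi_n^{[\ell]}(\chi) - \tilde{\varphi}_n^{[\ell]}(\chi)\right| \to 0 \quad \text{a.s., as } n \to \infty. \tag{12}$$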
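For the principal term in (10), we can write

$$\tilde{\varphi}_n^{[\ell]}(\chi) - r(\chi) f_n^{[\ell]}(\chi) = \left\{\tilde{\varphi}_n^{[\ell]}(\chi) - r(\chi) f_n^{[\ell]}(\chi) - E\left[\tilde{\varphi}_n^{[\ell]}(\chi) - r(\chi) f_n^{[\ell]}(\chi)\right]\right\} + \left\{E\left[\tilde{\varphi}_n^{[\ell]}(\chi) - r(\chi) f_n^{[\ell]}(\chi)\right]\right\} := N_1 + N_2. \tag{13}$$

Theorem 2 will therefore be completely proved if we show Lemmas 5 and 6
below. Indeed, from Lemma 3 we have $E\left(f_n^{[\ell]}(\chi)\right) = M_1 [1 + o(1)]$ and it
can be shown along the same lines as the proof of Lemma 5 below that

$$f_n^{[\ell]}(\chi) - E f_n^{[\ell]}(\chi) = O\left(\sqrt{\frac{\ln \ln n}{nF(h_n)}}\right) \quad \text{a.s.} \qquad \square$$

Lemma 5 Under assumptions H1-H6, we have

$$\limsup_{n \to \infty} \left[\frac{nF(h_n)}{\ln \ln n}\right]^{1/2} N_1 = \frac{\left[2\beta_{[1-2\ell]} \sigma_\varepsilon^2(\chi) M_2\right]^{1/2}}{\beta_{[1-\ell]}} \quad \text{a.s.}$$

Lemma 6 Assume that H1-H5 hold. If $\lim_{n \to +\infty} nh_n^2 = 0$, then

$$\lim_{n \to \infty} \left[\frac{nF(h_n)}{\ln \ln n}\right]^{1/2} N_2 = 0.$$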



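4.2.1 Proof of Lemma 5

Let us set

$$W_{n,i} = F(h_i)^{-\ell}\, K\left(\frac{\|x - X_i\|}{h_i}\right)\left[Y_i \mathbf{1}_{\{|Y_i| \le b_n\}} - r(x)\right] \quad \text{and} \quad Z_{n,i} = W_{n,i} - E W_{n,i},$$

and define

$$S_n = \sum_{i=1}^n Z_{n,i} \quad \text{and} \quad V_n = \sum_{i=1}^n E Z_{n,i}^2.$$

Observe that

$$V_n = \sum_{i=1}^n F(h_i)^{-2\ell}\, E\left\{K^2\left(\frac{\|x - X\|}{h_i}\right)[Y - r(x)]^2\right\} + \sum_{i=1}^n F(h_i)^{-2\ell}\, E\left\{K^2\left(\frac{\|x - X\|}{h_i}\right) Y[2r(x) - Y]\, \mathbf{1}_{\{|Y| > b_n\}}\right\} - \sum_{i=1}^n \left(E W_{n,i}\right)^2 := A_1 + A_2 - A_3. \tag{14}$$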
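The split of $V_n$ in (14) comes from a pointwise identity: for any real $y$, $r$ and $b > 0$, $(y\mathbf{1}_{\{|y| \le b\}} - r)^2 = (y - r)^2 + y(2r - y)\mathbf{1}_{\{|y| > b\}}$. Taking expectations against $K^2$ gives $A_1 + A_2$, and subtracting the squared means gives $A_3$. The following sketch simply checks the identity numerically; the function names are hypothetical.

```python
def truncated_square(y, r, b):
    """Left-hand side: (y * 1{|y| <= b} - r) ** 2."""
    y_trunc = y if abs(y) <= b else 0.0
    return (y_trunc - r) ** 2


def decomposed_square(y, r, b):
    """Right-hand side: (y - r)^2 + y * (2r - y) * 1{|y| > b}."""
    tail = 1.0 if abs(y) > b else 0.0
    return (y - r) ** 2 + y * (2.0 * r - y) * tail


if __name__ == "__main__":
    # One value below the truncation level and one above it.
    for y in (1.5, 5.0):
        print(y, truncated_square(y, 1.0, 2.0), decomposed_square(y, 1.0, 2.0))
```

When $|y| \le b$ both sides reduce to $(y - r)^2$; when $|y| > b$ both sides reduce to $r^2$.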
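We can write

$$A_1 = \sum_{i=1}^n F(h_i)^{-2\ell}\, E\left\{K^2\left(\frac{\|x - X\|}{h_i}\right) E\left[(Y - r(x))^2 \mid X\right]\right\} = \sum_{i=1}^n F(h_i)^{-2\ell}\, E\left[K^2\left(\frac{\|x - X\|}{h_i}\right)\right] \sigma_\varepsilon^2(x) + \sum_{i=1}^n F(h_i)^{-2\ell}\, E\left\{K^2\left(\frac{\|x - X\|}{h_i}\right)\left[\sigma_\varepsilon^2(X) - \sigma_\varepsilon^2(x)\right]\right\} := A_{11} + A_{12}.$$

From H2, by applying Fubini's theorem, we have

$$A_{11} = \sigma_\varepsilon^2(x) \sum_{i=1}^n F(h_i)^{1-2\ell} \left[K^2(1) - \int_0^1 \left(K^2(s)\right)' \tau_{h_i}(s)\, ds\right],$$

and from Toeplitz's lemma, by virtue of H3 and H4, we get

$$\frac{A_{11}}{nF(h_n)^{1-2\ell}} \to \beta_{[1-2\ell]}\, \sigma_\varepsilon^2(x)\, M_2, \quad \text{as } n \to +\infty. \tag{15}$$

For the second term of the decomposition of $A_1$, from (9) we have

$$A_{12} \le \sum_{i=1}^n F(h_i)^{1-2\ell} \sup_{x' \in B(x, h_i)} \left|\sigma_\varepsilon^2(x') - \sigma_\varepsilon^2(x)\right| \left[K^2(1) - \int_0^1 \left(K^2(s)\right)' \tau_{h_i}(s)\, ds\right].$$

The continuity of $\sigma_\varepsilon^2$ (H1), together with Toeplitz's lemma again, ensures that

$$\frac{A_{12}}{nF(h_n)^{1-2\ell}} \to 0, \quad \text{as } n \to +\infty. \tag{16}$$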



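Now, let us study the term $A_2$ appearing in the decomposition of $V_n$. Using the Cauchy-Schwarz inequality, and denoting $\|K\|_\infty := \sup_{t \in [0,1]} K(t)$, we get

$$A_2 \le \|K\|_\infty^2 \sum_{i=1}^n F(h_i)^{-2\ell} \left\{E\left(Y^2[2r(x) - Y]^2\right) P(|Y| > b_n)\right\}^{1/2} \le 3 Q_n \|K\|_\infty^2 \sum_{i=1}^n F(h_i)^{-2\ell},$$

where

$$Q_n = \left(\max\left\{E(Y^4),\ 4|r(x)|\, E|Y|^3,\ 4r^2(x)\, E(Y^2)\right\} P(|Y| > b_n)\right)^{1/2}.$$

We deduce from H4 and H5, again with the choice $b_n = (\delta \ln n)^{1/\mu}$, that

$$\frac{A_2}{nF(h_n)^{1-2\ell}} = O\left(\frac{\exp\left(-\frac{\lambda\delta}{2} \ln n\right)}{F(h_n)}\right) \to 0, \quad \text{as } n \to +\infty \text{ with } \delta > \frac{2}{\lambda}. \tag{17}$$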



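Next, for the last term $A_3$, we have

$$|A_3| \le b_n^2\,[1 + o(1)] \sum_{i=1}^n F(h_i)^{2-2\ell} \left[K(1) - \int_0^1 K'(s)\, \tau_{h_i}(s)\, ds\right]^2.$$

It follows from H6 that

$$\frac{A_3}{nF(h_n)^{1-2\ell}} = O\left(F(h_n)(\ln n)^{2/\mu}\right) \to 0, \quad \text{as } n \to +\infty. \tag{18}$$

We deduce from (15), (16), (17) and (18) that

$$V_n \sim nF(h_n)^{1-2\ell}\, \beta_{[1-2\ell]}\, \sigma_\varepsilon^2(x)\, M_2, \quad \text{as } n \to +\infty. \tag{19}$$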



Next, since ln f^ n ) — >• as n — > +00, then the first part of H6 implies that 



Inn 



nF(h n ) (Inn) '< 



., pp. — > 00, as n ->■ +00. 

ln [nf^l-a] {lnln [nF(h n ) 1 ~ 2t ]f {a+1) 

Setting $b_n = (\delta \ln n)^{1/\mu}$, it follows that there exists $n_0 \ge 1$ such that for any $i \ge n_0$, we have



iF(/i;)(hn) f 



ln [zF(^) 1 - 2 ^] {lnln [iF^) 1 - 2 *]} 2 ^ 



> 



2||i^||^max{|r( X )| 2 ,(51ni)^} 



F(h 



.\2£ 



> 



So, the event < Z 2 2 > 



ln[iF(hi) 1 -^]{lnln[jF(h l ) 1 - M ]} 



MU 2( Q +i) 



jy > is empty for i > uq. 



We deduce from (19) that 



£ illlln . i:) "i:i 1 



i=i 



Vi 



z 2 > 



InVjNnm Vj] v ; 



< OO. 



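Let $S$ be a random function defined on $[0, +\infty[$ such that for any $t \in [V_n, V_{n+1}[$, $S(t) = S_n$. Using Theorem 3.1 in Jain et al. (1975), there exists a Brownian motion $\xi$ such that

$$\frac{S(t) - \xi(t)}{(2t \ln\ln t)^{1/2}} = o\left((\ln\ln t)^{-\alpha/2}\right) \quad \text{a.s., as } t \to \infty, \text{ for any } t \in [V_n, V_{n+1}[.$$

It follows that

$$\limsup_{t \to \infty} \frac{S(t)}{(2t \ln\ln t)^{1/2}} = \limsup_{t \to \infty} \frac{\xi(t)}{(2t \ln\ln t)^{1/2}} = 1 \quad \text{a.s.},$$

and then we have

$$\limsup_{n \to \infty} \frac{S_n}{(2V_n \ln\ln V_n)^{1/2}} = 1 \quad \text{a.s.}, \tag{20}$$

by virtue of the definition of $S$ and the fact that $V_{n+1}/V_n \to 1$ as $n \to \infty$. From (19), we have

$$\lim_{n \to \infty} \frac{\left\{nF(h_n)^{1-2\ell} \ln\ln\left[nF(h_n)^{1-2\ell}\right]\right\}^{1/2} B_{n,1-\ell}}{(2V_n \ln\ln V_n)^{1/2}} = \frac{\beta_{[1-\ell]}}{\left\{2\beta_{[1-2\ell]}\, \sigma_\varepsilon^2(x)\, M_2\right\}^{1/2}}.$$

Lemma 5 follows from the last convergence and the fact that $S_n = N_1 \sum_{i=1}^n F(h_i)^{1-\ell}$, with the help of (20). □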



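4.2.2 Proof of Lemma 6

We have

$$N_2 = \frac{1}{\sum_{i=1}^n F(h_i)^{1-\ell}} \sum_{i=1}^n F(h_i)^{-\ell}\, E\left\{K\left(\frac{\|x - X\|}{h_i}\right)(r(X) - r(x))\right\} - \frac{1}{\sum_{i=1}^n F(h_i)^{1-\ell}} \sum_{i=1}^n F(h_i)^{-\ell}\, E\left\{K\left(\frac{\|x - X\|}{h_i}\right) Y\, \mathbf{1}_{\{|Y| > b_n\}}\right\} := A + B. \tag{21}$$

As in the proof of Lemma 1 we can write

$$A = h_n \frac{\alpha_{[\ell]}}{\beta_{[1-\ell]}}\, \varphi'(0)\, M_0\, [1 + o(1)], \tag{22}$$

and then,

$$\left[\frac{nF(h_n)}{\ln\ln n}\right]^{1/2} A = \left[\frac{nF(h_n)}{\ln\ln n}\right]^{1/2} h_n \frac{\alpha_{[\ell]}}{\beta_{[1-\ell]}}\, \varphi'(0)\, M_0\, [1 + o(1)] = o(1),$$

where the last equality follows from the condition $n h_n^2 \to 0$. For the second term of the right-hand side in (21), using the Cauchy-Schwarz inequality and the boundedness of the kernel $K$, we get

$$|B| \le \frac{\|K\|_\infty}{\sum_{i=1}^n F(h_i)^{1-\ell}} \sum_{i=1}^n F(h_i)^{-\ell} \left\{E\left[Y^2\right] P\left[|Y| > b_n\right]\right\}^{1/2}.$$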



From Markov's inequality combined with (1), it follows that



I All 



n 



Y,F(h t y-z ,=i 

i=l 

I Bn >- £ Q nn )2/M 



A|y*l» 



1/2 



(23) 






which gives 



nF(h n ) 
In Inn 



1/2 



B 



O 



n l - XS (Inn) 



\/ln In n \J nF(h n ) 



2/M 



and Lemma [6] is proved. 



o(l) if 6 > -, 



□ 



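4.3 Proof of Theorem 3

Using the decomposition (10), we have to show that

$$\sqrt{nF(h_n)} \left[\varphi_n^{[\ell]}(x) - \tilde\varphi_n^{[\ell]}(x)\right] \to 0 \quad \text{a.s.} \tag{24}$$

and

$$\sqrt{nF(h_n)} \left[\tilde\varphi_n^{[\ell]}(x) - r(x) f_n^{[\ell]}(x)\right] \xrightarrow{\mathcal{D}} \mathcal{N}\left(c\, \frac{\alpha_{[\ell]}}{\beta_{[1-\ell]}}\, \varphi'(0)\, M_0,\ \frac{\beta_{[1-2\ell]}}{\beta_{[1-\ell]}^{2}}\, \sigma_\varepsilon^2(x)\, M_2\right), \tag{25}$$

where $\tilde\varphi_n^{[\ell]}$ is defined in (11) and $c$ is such that $\lim_{n \to \infty} \sqrt{nF(h_n)}\, h_n = c$, since $f_n^{[\ell]}(x) \xrightarrow{\mathbb{P}} M_1$. The latter follows from the first parts of Lemmas 3 and 4. For (24), following the same lines as in the proof of (12), with $\left[\frac{\ln\ln n}{nF(h_n)}\right]^{1/2}$ replaced by $\frac{1}{\sqrt{nF(h_n)}}$, gives the desired result. For (25), using the decomposition (13), it remains to prove Lemmas 7 and 8 below.

Lemma 7 Assume that Assumptions H1-H5 and H7 hold. Then

$$\sqrt{nF(h_n)}\, N_1 \xrightarrow{\mathcal{D}} \mathcal{N}\left(0,\ \frac{\beta_{[1-2\ell]}}{\beta_{[1-\ell]}^{2}}\, \sigma_\varepsilon^2(x)\, M_2\right).$$

Lemma 8 Assume that Assumptions H1-H5 hold. If there exists $c \ge 0$ such that $\lim_{n \to \infty} h_n \sqrt{nF(h_n)} = c$, then

$$\lim_{n \to \infty} \sqrt{nF(h_n)}\, N_2 = c\, \frac{\alpha_{[\ell]}}{\beta_{[1-\ell]}}\, \varphi'(0)\, M_0.$$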
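The proof of Lemma 7 below goes through Lyapunov's condition for a sum of independent centered variables: for some $p > 2$, the ratio $\sum_i E|Z_i|^p / \left(\sum_i \mathrm{Var}\, Z_i\right)^{p/2}$ must tend to 0. The sketch below evaluates that ratio from user-supplied moments. As an illustration it uses i.i.d. random signs, a hypothetical example unrelated to the variables $Z'_{n,i}$ of the paper, for which the ratio equals $n^{1-p/2}$.

```python
def lyapunov_ratio(abs_moments_p, variances, p):
    """Return sum_i E|Z_i|^p divided by (sum_i Var Z_i) ** (p / 2)."""
    if p <= 2:
        raise ValueError("Lyapunov's condition uses an exponent p > 2")
    return sum(abs_moments_p) / sum(variances) ** (p / 2.0)


def rademacher_ratio(n, p):
    # Illustrative case: i.i.d. signs Z_i = +1 or -1 with probability 1/2,
    # so that E|Z_i|^p = 1 and Var Z_i = 1 for every i.
    return lyapunov_ratio([1.0] * n, [1.0] * n, p)


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, rademacher_ratio(n, p=3.0))
```

In the proof of Lemma 7 the variables are already normalized so that the sum of variances converges, see (26), so it is enough to show that $\sum_i E|Z'_{n,i}|^p$ itself tends to 0.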



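4.3.1 Proof of Lemma 7

Setting

$$W'_{n,i} = \frac{\sqrt{nF(h_n)}}{\sum_{j=1}^n F(h_j)^{1-\ell}}\, W_{n,i} \quad \text{and} \quad Z'_{n,i} = W'_{n,i} - E W'_{n,i},$$

where $W_{n,i}$ is defined in the proof of Theorem 2, then

$$\sqrt{nF(h_n)}\, N_1 = \sum_{i=1}^n Z'_{n,i}.$$

To prove Lemma 7, we first prove that

$$\lim_{n \to \infty} \sum_{i=1}^n E\left(Z'^{2}_{n,i}\right) = \frac{\beta_{[1-2\ell]}}{\beta_{[1-\ell]}^{2}}\, \sigma_\varepsilon^2(x)\, M_2, \tag{26}$$

and then check that $W'_{n,i}$ satisfies Lyapunov's condition. Next, from (19) we have

$$\sum_{i=1}^n E\left(Z'^{2}_{n,i}\right) = \frac{nF(h_n)}{\left(\sum_{i=1}^n F(h_i)^{1-\ell}\right)^{2}}\, V_n = \frac{\beta_{[1-2\ell]}}{B_{n,1-\ell}^{2}}\, \sigma_\varepsilon^2(x)\, M_2\, [1 + o(1)],$$

which proves (26). To check Lyapunov's condition, let $p > 2$; we have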

n n 

E E (KiH = E E d^rXi)- 



t=i i=i 

Since 



it follows that 



{nF{h n ))l t F(h t )-^Var (k {Yil m < K } ~ Kx))) 



i=l 



2 2 -p \\K\fe p \b n - r( X )\ 2 - p (t Fihi) 1 -^ 



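Using the same decomposition as in (14), we have

$$\sum_{i=1}^n F(h_i)^{-p\ell}\, \mathrm{Var}\left(K\left(\frac{\|x - X_i\|}{h_i}\right)\left(Y_i \mathbf{1}_{\{|Y_i| \le b_n\}} - r(x)\right)\right) = \sum_{i=1}^n F(h_i)^{-p\ell}\, E\left(K^2\left(\frac{\|x - X\|}{h_i}\right)[Y - r(x)]^2\right) + \sum_{i=1}^n F(h_i)^{-p\ell}\, E\left\{K^2\left(\frac{\|x - X\|}{h_i}\right) Y[2r(x) - Y]\, \mathbf{1}_{\{|Y| > b_n\}}\right\} - \sum_{i=1}^n F(h_i)^{-p\ell} \left(E\left[K\left(\frac{\|x - X\|}{h_i}\right)\left(Y \mathbf{1}_{\{|Y| \le b_n\}} - r(x)\right)\right]\right)^2 := B_1 + B_2 - B_3. \tag{27}$$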
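Setting $b_n = (\delta \ln n)^{1/\mu}$ for some $\delta, \mu > 0$ and following the same lines as in the proofs of (15), (16), (17) and (18), with the exponent 2 replaced by $p$ in all the expressions, we have

$$B_1 = O\left(nF(h_n)^{1-p\ell}\right),$$

so that from Toeplitz's lemma, we can write

$$\frac{(nF(h_n))^{p/2}}{\left(\sum_{i=1}^n F(h_i)^{1-\ell}\right)^{p}}\, |b_n - r(x)|^{p-2}\, B_1 = O\left(\left[\frac{(\ln n)^{1/\mu}}{\sqrt{nF(h_n)}}\right]^{p-2}\right) = o(1).$$

Next, for the second expression $B_2$ of (27), we get

$$\frac{B_2}{nF(h_n)^{1-p\ell}} = O\left(\frac{\exp\left(-\frac{\lambda\delta}{2} \ln n\right)}{F(h_n)}\right) = o(1) \quad \text{with } \delta > \frac{2}{\lambda}.$$

It follows again from Toeplitz's lemma that

$$\frac{(nF(h_n))^{p/2}}{\left(\sum_{i=1}^n F(h_i)^{1-\ell}\right)^{p}}\, |b_n - r(x)|^{p-2}\, B_2 = O\left(\left[\frac{(\ln n)^{1/\mu}}{\sqrt{nF(h_n)}}\right]^{p-2}\right) = o(1).$$

In the same manner from (18), we have

$$\frac{B_3}{nF(h_n)^{1-p\ell}} = O\left(F(h_n)(\ln n)^{2/\mu}\right),$$

so that

$$\frac{(nF(h_n))^{p/2}}{\left(\sum_{i=1}^n F(h_i)^{1-\ell}\right)^{p}}\, |b_n - r(x)|^{p-2}\, B_3 = O\left(\left[\frac{(\ln n)^{1/\mu}}{\sqrt{nF(h_n)}}\right]^{p-2}\right) = o(1),$$

which concludes the proof of Lemma 7. □

4.3.2 Proof of Lemma 8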



We go back to the decomposition (21) in the proof of Lemma 6. On the one hand, from (22), we write

$$\sqrt{nF(h_n)}\, A = \sqrt{nF(h_n)}\, h_n \frac{\alpha_{[\ell]}}{\beta_{[1-\ell]}}\, \varphi'(0)\, M_0\, [1 + o(1)].$$



On the other hand from (23), we get 



yJnF{h n )B = O 



\JnF{h r 



o(l) if 5> 



1 



and Lemma 8 follows from the combination of (28) and (29).



(28) 

(29) 
□ 